# Process and socket auditing with osquery

Enabling these auditing features requires additional configuration of osquery. osquery can leverage either BPF, Audit, OpenBSM or EndpointSecurity subsystems to record process executions and network connections in near real-time on Linux and macOS systems. Although these auditing features are extremely powerful for recording the activity from a host, they may introduce additional CPU overhead and greatly increase the number of log events generated by osquery.

To read more about how event-based tables are created and designed, check out the osquery [Table Pubsub Framework](../development/pubsub-framework.md).

Because different platforms have different choices for collecting real-time event data, osquery has multiple tables to present this information depending on the source and platform:

| Event type | osquery Table | Source | Supported Platform |
| -------------- | -------- | -------- | -------- |
| Process events | [`process_events`](https://osquery.io/schema/current#process_events)    | Audit (Linux), OpenBSM (macOS)  | Linux, macOS (10.15 and older) |
| Process events | [`bpf_process_events`](https://osquery.io/schema/current#bpf_process_events) | BPF | Linux (kernel 4.18 and newer) |
| Process events | [`es_process_events`](https://osquery.io/schema/current#es_process_events) | EndpointSecurity | macOS (10.15 and newer) |
| Socket events  | [`socket_events`](https://osquery.io/schema/current#socket_events)      | Audit (Linux), OpenBSM (macOS) | Linux, macOS (10.15 and older) |
| Socket events  | [`bpf_socket_events`](https://osquery.io/schema/current#bpf_socket_events) | BPF | Linux (kernel 4.18 and newer) |

To collect process events, you would add a query like the following to your query schedule, or to a query pack:

```sql
SELECT * FROM process_events;
```

Each of these auditing features is enabled on a per-source basis using additional osquery configuration settings. Enabling any of them may have performance impact depending on the host activity, and should be tested in your environment before deployment. See the OS-specific sections for guidance.

## General Troubleshooting

Some testing of the underlying operating system configuration can
be performed via `osqueryi`, but keep in mind that `osqueryi` and
`osqueryd` operate independently and do not communicate.

The `--verbose` flag can be really useful when trying to debug a problem.

### Examine configuration flags

To verify that osquery's flags are set correctly, you can query the
`osquery_flags` table. For example, on a macOS machine, this shows
that osquery will process OpenBSM events.

```sql
osquery> select * from osquery_flags where name in ("disable_events", "disable_audit");
+----------------+------+---------------------------------------------------+---------------+-------+------------+
| name           | type | description                                       | default_value | value | shell_only |
+----------------+------+---------------------------------------------------+---------------+-------+------------+
| disable_audit  | bool | Disable receiving events from the audit subsystem | true          | false | 0          |
| disable_events | bool | Disable osquery publish/subscribe system          | false         | false | 0          |
+----------------+------+---------------------------------------------------+---------------+-------+------------+
```

### Examine event table

osquery keeps state about the events subsystem in the `osquery_events`
table. The `events` column is of note here.

This example is from a macOS machine with events enabled, but with no
events recorded yet. Try triggering an event, and then confirm that
the event count is non-zero. If it remains at zero, the problem is likely
in how the OS auditing side is configured. See the platform-specific
instructions.

```sql
osquery> select * from osquery_events;
+-------------------------+-----------------+------------+---------------+--------+-----------+--------+
| name                    | publisher       | type       | subscriptions | events | refreshes | active |
+-------------------------+-----------------+------------+---------------+--------+-----------+--------+
| diskarbitration         | diskarbitration | publisher  | 1             | 0      | 0         | 1      |
| event_tapping           | event_tapping   | publisher  | 1             | 0      | 0         | 0      |
| fsevents                | fsevents        | publisher  | 0             | 0      | 24        | 1      |
| iokit                   | iokit           | publisher  | 1             | 0      | 0         | 1      |
| openbsm                 | openbsm         | publisher  | 9             | 0      | 0         | 0      |
| scnetwork               | scnetwork       | publisher  | 0             | 0      | 0         | 0      |
| disk_events             | diskarbitration | subscriber | 1             | 0      | 0         | 1      |
| file_events             | fsevents        | subscriber | 0             | 0      | 0         | 1      |
| hardware_events         | iokit           | subscriber | 1             | 0      | 0         | 1      |
| process_events          | openbsm         | subscriber | 8             | 0      | 0         | 1      |
| user_events             | openbsm         | subscriber | 1             | 0      | 0         | 1      |
| user_interaction_events | event_tapping   | subscriber | 1             | 0      | 0         | 1      |
| yara_events             | fsevents        | subscriber | 0             | 0      | 0         | 1      |
+-------------------------+-----------------+------------+---------------+--------+-----------+--------+
```
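
The check described above can also be scripted. The following is a minimal sketch, assuming the rows were produced by `osqueryi --json 'select * from osquery_events'` (a JSON array of objects whose values are strings). The function name `idle_subscribers` and the sample rows are hypothetical and only mirror the table shown above.

```python
import json


def idle_subscribers(rows_json):
    """Return the names of active subscribers that have recorded no events."""
    rows = json.loads(rows_json)
    return [
        row["name"]
        for row in rows
        if row["type"] == "subscriber"
        and row["active"] == "1"
        and int(row["events"]) == 0
    ]


# Illustrative rows shaped like the osquery_events output above.
sample = json.dumps([
    {"name": "openbsm", "publisher": "openbsm", "type": "publisher",
     "subscriptions": "9", "events": "0", "refreshes": "0", "active": "0"},
    {"name": "process_events", "publisher": "openbsm", "type": "subscriber",
     "subscriptions": "8", "events": "0", "refreshes": "0", "active": "1"},
    {"name": "disk_events", "publisher": "diskarbitration", "type": "subscriber",
     "subscriptions": "1", "events": "3", "refreshes": "0", "active": "1"},
])
print(idle_subscribers(sample))
```

A subscriber that stays in this list after you have triggered a matching event is a good starting point for the platform-specific troubleshooting steps.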

## Linux process auditing using Audit

On Linux, osquery can use the Audit system to collect and process events. It accomplishes this by monitoring syscalls such as `execve()` and `execveat()`. `auditd` should not be running when using osquery's process auditing, as it will conflict with `osqueryd` over access to the audit netlink socket. You should also ensure `auditd` is not configured to start at boot.

The only prerequisite for using osquery's auditing functionality on Linux is that you must use a kernel version that contains the Audit functionality. Most kernels over version 2.6 have this capability.

There is no requirement to install `auditd` or `libaudit`. Osquery only uses the audit features that exist in the kernel.

A sample log entry from `process_events` may look something like this:

```json
{
  "action": "added",
  "columns": {
    "uid": "0",
    "time": "1527895541",
    "pid": "30219",
    "path": "/usr/bin/curl",
    "auid": "1000",
    "cmdline": "curl google.com",
    "ctime": "1503452096",
    "cwd": "",
    "egid": "0",
    "euid": "0",
    "gid": "0",
    "parent": ""
  },
  "unixTime": 1527895550,
  "hostIdentifier": "vagrant",
  "name": "process_events",
  "numerics": false
}
```
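
Entries like this are easy to post-process. The sketch below assumes the results log contains one JSON object per line, as in the sample above, and that the scheduled query is named `process_events`. The helper name `summarize_process_event` is hypothetical.

```python
import json


def summarize_process_event(line, query_name="process_events"):
    """Return a compact dict for one results-log line, or None if it does not apply."""
    record = json.loads(line)
    if record.get("name") != query_name or record.get("action") != "added":
        return None
    columns = record["columns"]
    return {
        "time": int(columns["time"]),  # column values are logged as strings
        "pid": int(columns["pid"]),
        "path": columns["path"],
        "cmdline": columns["cmdline"],
    }


# A shortened version of the sample entry above.
sample_line = json.dumps({
    "action": "added",
    "columns": {"uid": "0", "time": "1527895541", "pid": "30219",
                "path": "/usr/bin/curl", "cmdline": "curl google.com"},
    "unixTime": 1527895550,
    "hostIdentifier": "vagrant",
    "name": "process_events",
    "numerics": False,
})
print(summarize_process_event(sample_line))
```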

To better understand how this works, let's walk through 4 configuration options. These flags can be set at the [command line](../installation/cli-flags.md) or placed into the `osquery.flags` file.

1. `--disable_audit=false`: by default this is set to `true`, which prevents osquery from opening the kernel audit's netlink socket. Setting it to `false` tells osquery to enable auditing functionality.
2. `--audit_allow_config=true`: by default this is set to `false`, which prevents osquery from making changes to the audit configuration settings. These changes include adding/removing rules, setting the global enable flags, and adjusting performance and rate parameters. Unless you plan to set all of those things manually, you should set this to `true`. If you are configuring audit using a control binary or `/etc/audit.conf`, osquery *may* override your settings.
3. `--audit_persist=true`: by default this is `true`, which instructs osquery to 'regain' the audit netlink socket if another process also accesses it. However, you should do your best to ensure there will be no other program running which is attempting to access the audit netlink socket.
4. `--audit_allow_process_events=true`: this flag indicates that you would like to record process events.
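
As a rough aid, a flagfile can be checked against the four options above. This is only a sketch: it assumes one `--name=value` flag per line, treats a bare boolean flag as `true`, and uses only the defaults stated in this section. Because the default for `audit_allow_process_events` is not stated here, the sketch reports it for review whenever it is not set explicitly. The names `parse_flagfile` and `flags_to_review` are hypothetical.

```python
# Values recommended in the list above.
WANTED = {
    "disable_audit": "false",
    "audit_allow_config": "true",
    "audit_persist": "true",
    "audit_allow_process_events": "true",
}
# Defaults as stated in this section (none is given for the fourth flag).
DOCUMENTED_DEFAULTS = {
    "disable_audit": "true",
    "audit_allow_config": "false",
    "audit_persist": "true",
}


def parse_flagfile(text):
    """Map flag names to values; a bare `--flag` is treated as `true`."""
    flags = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("--"):
            continue
        name, sep, value = line[2:].partition("=")
        flags[name] = value.lower() if sep else "true"
    return flags


def flags_to_review(text):
    """Return the flags whose effective value differs from the wanted one."""
    flags = parse_flagfile(text)
    review = []
    for name, wanted in WANTED.items():
        effective = flags.get(name, DOCUMENTED_DEFAULTS.get(name))
        if effective != wanted:
            review.append(name)
    return sorted(review)


good = "--disable_audit=false\n--audit_allow_config\n--audit_allow_process_events=true\n"
print(flags_to_review(good))
print(flags_to_review("--disable_events=false\n"))
```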

## Linux socket auditing using Audit

Osquery can also be used to record network connections by enabling `socket_events`. This table uses the syscalls `bind()` and `connect()` to gather information about network connections. This table is not automatically enabled when process_events are enabled because it can introduce considerable load on the system.

To enable socket events, use the `--audit_allow_sockets` flag.

A sample `socket_events` log entry looks like this:

```json
{
  "action": "added",
  "columns": {
    "time": "1527895541",
    "status": "succeeded",
    "remote_port": "80",
    "action": "connect",
    "auid": "1000",
    "family": "2",
    "local_address": "",
    "local_port": "0",
    "path": "/usr/bin/curl",
    "pid": "30220",
    "remote_address": "172.217.164.110"
  },
  "unixTime": 1527895545,
  "hostIdentifier": "vagrant",
  "name": "socket_events",
  "numerics": false
}
```
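
To get a quick picture of which binaries connect where, entries like the one above can be aggregated. The sketch below assumes one JSON object per results-log line and a scheduled query named `socket_events`. The function name `count_connections` and the sample lines are hypothetical.

```python
import json
from collections import Counter


def count_connections(lines, query_name="socket_events"):
    """Count connect events per (path, remote_address, remote_port)."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        if record.get("name") != query_name:
            continue
        columns = record["columns"]
        if columns.get("action") != "connect":
            continue  # skip bind and other socket actions
        key = (columns["path"], columns["remote_address"], int(columns["remote_port"]))
        counts[key] += 1
    return counts


def make_line(action, remote_address, remote_port):
    """Build an illustrative results-log line shaped like the sample above."""
    return json.dumps({
        "action": "added",
        "columns": {"action": action, "path": "/usr/bin/curl",
                    "remote_address": remote_address, "remote_port": remote_port},
        "name": "socket_events",
    })


sample_lines = [
    make_line("connect", "172.217.164.110", "80"),
    make_line("connect", "172.217.164.110", "80"),
    make_line("bind", "0.0.0.0", "0"),
]
for key, count in count_connections(sample_lines).items():
    print(key, count)
```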

If you would like to log UNIX domain sockets use the hidden flag: `--audit_allow_unix`. This will put considerable strain on the system as many default actions use domain sockets. You will also need to explicitly select the `socket` column from the `socket_events` table.

The `success` column has been deprecated and replaced with `status`:

| Status value | Description |
|-|-|
| failed | Definitely failed |
| succeeded | Definitely succeeded |
| in_progress | The `connect()` syscall has been marked as "in progress" (EINPROGRESS) and osquery can't determine whether it will succeed or not. Reserved for non-blocking sockets. |
| no_client | The `accept` or `accept4` syscall returned with EAGAIN since there were no incoming connections. Reserved for non-blocking sockets. |

The behavior of the socket_events table can be changed with the following boolean flags:

| Flag | Description |
|-|-|
| --audit_allow_sockets | Allow the audit publisher to install socket-related rules |
| --audit_allow_unix | Allow socket events to collect domain sockets |
| --audit_allow_failed_socket_events | Include rows for socket events that have failed |
| --audit_allow_accept_socket_events | Include rows for accept socket events |
| --audit_allow_null_accept_socket_events | Allow non-blocking accept() syscalls that returned EAGAIN/EWOULDBLOCK |

## Troubleshooting Audit-based process and socket auditing on Linux

There are a few different methods to ensure you have configured auditing correctly.

1. Ensure you have supplied all of the necessary flags mentioned above, either as command-line arguments or in your flagfile.
2. Verify `auditd` is not running, if it is installed on the system.
3. Run `auditctl -s` if the binary is present on your system and verify that `enabled` is not set to zero and that the `pid` corresponds to an osquery process.
4. Verify that your osquery configuration has a query to `SELECT` from the `process_events` and/or `socket_events` tables.
5. You may also run auditing using osqueryi **as root**:

```sh
osqueryi --audit_allow_config=true --audit_allow_sockets=true --audit_persist=true --disable_audit=false --events_expiry=1 --events_max=50000 --logger_plugin=filesystem  --disable_events=false
```

If you would like to debug the raw audit events as `osqueryd` sees them, use the hidden flag `--audit_debug`. This will print all of the RAW audit lines to osquery's `stdout`.

> NOTICE: Linux systems running `journald` will collect logging data originating from the kernel audit subsystem (something that osquery enables) from several sources, including audit records. To avoid performance problems on busy boxes (especially when osquery event tables are enabled), it is recommended to mask audit logs from entering the journal with the following command: `systemctl mask --now systemd-journald-audit.socket`.

### Avoid throttling, losing events and interpreting Audit publisher throttling messages

If osquery is CPU constrained and is processing a high enough stream of events, you may receive this warning message:  
`The Audit publisher has throttled reading records from Netlink for <N> seconds. Some events may have been lost.`.

This message appears at most once per minute. It indicates that the Audit publisher had to slow down reading records from the Netlink socket for the reported duration since the previous throttling message.
This happens when osquery is not processing records fast enough to prevent its internal buffers growing too much and consuming too much memory.

Throttling may cause loss of events, since the Audit subsystem backlog buffer could fill up; if that happens the kernel will be forced to drop some of them.  
You can check if this is happening by looking at the `lost` field via `auditctl -s`.
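
This check can be automated. The sketch below assumes that `auditctl -s` prints one `name value` pair per line, as recent versions do (older versions use a different, single-line format). The sample text is illustrative rather than captured output, and the function names are hypothetical.

```python
def parse_audit_status(text):
    """Parse `name value` lines into a dict of integers, skipping other lines."""
    status = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            status[parts[0]] = int(parts[1])
    return status


def audit_problems(status, osquery_pid):
    """Return human-readable findings for the fields discussed in this section."""
    problems = []
    if status.get("enabled", 0) == 0:
        problems.append("auditing is disabled")
    if status.get("pid") != osquery_pid:
        problems.append("audit netlink socket is not owned by osquery")
    if status.get("lost", 0) > 0:
        problems.append("kernel reported lost events")
    return problems


# Illustrative status text; the field layout is an assumption.
sample_status = """enabled 1
failure 1
pid 1234
rate_limit 0
backlog_limit 4096
lost 12
backlog 0
"""
print(audit_problems(parse_audit_status(sample_status), osquery_pid=1234))
```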

Throttling currently starts when more than 4096 records have been read and are still in the queue to be processed by osquery; this is a number of records
which can support high spikes of events, and is a limit for osquery to avoid consuming memory indefinitely.  
Keep in mind that if the high rate of events continues, even with throttling happening, you might still have to increase your default [watchdog memory limit](../installation/cli-flags.md#daemon-control-flags) or reduce the interval of the scheduled query on the evented table, due to the amount of rows that it will have to generate at once.

There's also a second throttling point in the Audit publisher pipeline, which exists after the records have been read from the Netlink socket and are then parsed into a more computer friendly format.  
When throttling happens here, another message will be logged which is:  
`The Audit publisher has throttled record processing for <N> seconds. This may cause further throttling and loss of events.`.

This message exists mostly for debugging purposes and will only appear if `--verbose` is active, because this doesn't necessarily cause loss of events: a bottleneck in this point of the pipeline will have to cause throttling in the Netlink socket reading side, before possibly causing loss of events.  
So as long as no throttling is happening on the reading side, no loss of events should happen due to this.

To avoid throttling, there isn't much to be done beyond reducing constraints on the CPU or, in general, having osquery process fewer events.

To reduce the chance of losing events, first ensure that throttling happens as rarely as possible. Then we can try to increase the backlog buffer that the Audit subsystem is using via the `--audit_backlog_limit` flag, to attempt to support bigger or slightly longer event spikes.  
Keep in mind that increasing this will increase the amount of memory used by the Audit subsystem and that this memory is not allocated by osquery, so it won't be accounted for by the watchdog.

## User event auditing with Audit

On Linux, a companion table called `user_events` is included that provides several authentication-based events. If you are enabling process auditing it should be trivial to also include this table.

## Linux process and socket auditing using BPF

When osquery is running on a recent kernel (>= 4.18), the BPF eventing framework can be used. This event publisher needs to monitor for more system calls to reach feature parity with the Audit-based tables. For this reason, enabling BPF will also enable both the `bpf_process_events` and `bpf_socket_events` tables.

In order to start the publisher and enable the subscribers, the following flags must be passed: `--disable_events=false --enable_bpf_events=true`. The `--verbose` flag can also be extremely useful when setting up the configuration for the first time, since it emits more debug information when something fails.

The BPF framework will make use of a perf event array and several per-cpu maps in order to receive events and correctly capture strings and buffers. These structures can be configured using the following command line flags:

- **bpf_perf_event_array_exp**: size of the perf event array, as a power of two
- **bpf_buffer_storage_size**: how many slots of 4096 bytes should be available in each memory pool

Memory usage depends on both:

 1. How many processors are currently online
 2. How many processors can be added by hotswapping

The BPF event publisher uses 6 memory pools, grouping system calls in order to evenly distribute memory usage. Not counting the internal maps used to merge sys_enter/sys_exit events (the size for these maps is rather small), memory usage can be easily estimated with the following formula:

```cpp
buffer_storage_bytes = memory_pool_count * (bpf_buffer_storage_size * 4096) * possible_cpu_count
```

```cpp
perf_bytes = (2 ^ bpf_perf_event_array_exp) * online_cpu_count
```

The cpu count numbers can be read from the `/sys` folder:

```text
possible_cpu_count: /sys/devices/system/cpu/possible
online_cpu_count: /sys/devices/system/cpu/online
```
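
The two formulas can be evaluated with a short script. This is a sketch under stated assumptions: the files under `/sys` use the kernel's CPU list format (for example `0-3` or `0-3,8`), `^` in the formula above denotes exponentiation, and the flag values passed in the example are hypothetical rather than osquery's defaults.

```python
MEMORY_POOL_COUNT = 6  # the BPF publisher uses 6 memory pools
SLOT_SIZE = 4096       # each buffer storage slot is 4096 bytes


def count_cpus(cpulist):
    """Count the CPUs named in a kernel CPU list string such as "0-3,8"."""
    total = 0
    for part in cpulist.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        total += (int(last) if last else int(first)) - int(first) + 1
    return total


def estimate_bpf_memory(buffer_storage_size, perf_event_array_exp,
                        possible_cpus, online_cpus):
    """Return (buffer_storage_bytes, perf_bytes) following the formulas above."""
    buffer_storage_bytes = (
        MEMORY_POOL_COUNT * (buffer_storage_size * SLOT_SIZE) * possible_cpus
    )
    perf_bytes = (2 ** perf_event_array_exp) * online_cpus
    return buffer_storage_bytes, perf_bytes


# Example: 4 CPUs online, hotswapping raising the possible count to 128.
possible = count_cpus("0-127")
online = count_cpus("0-3")
print(estimate_bpf_memory(8, 14, possible, online))
```

On a real host, the two strings would be read from `/sys/devices/system/cpu/possible` and `/sys/devices/system/cpu/online`. In this example the buffer storage term dominates because it scales with the possible CPU count rather than the online one.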

VMware Fusion (and possibly other systems as well) supports CPU hotswapping, raising the `possible_cpu_count` to 128. This causes a huge increase in memory usage, and it is for this reason that the default settings are rather low.

This problem can be easily fixed by disabling hotswapping. This setting is unfortunately not available through the user interface, so it needs to be changed directly in the .vmx file (`vcpu.hotadd=FALSE`).

## macOS process & socket auditing

### Auditing processes with OpenBSM

To enable OpenBSM-based process auditing in osquery, set the following command-line flags:

- `--disable_audit=false`
- `--disable_events=false` 
- `--audit_allow_config`

**Note:** macOS systems 10.15 and earlier ship with the OpenBSM subsystem enabled, but the default settings do not audit process execution or the root user. The osquery command-line flag `--audit_allow_config` will make run-time configuration changes to your system audit to enable these features. This is all you need to get up and running.

Alternatively, instead of using the `--audit_allow_config` flag, you may edit the `audit_control` file in `/etc/security/` for more granular/nuanced needs. This is optional and considered an "advanced configuration". An example configuration is provided below, but the important flags are: `ex`, `pc`, `argv`, and `arge`. The `ex` flag will log `exec` events, while `pc` logs `exec`, `fork`, and `exit`. If you don't need `fork` and `exit` you may leave that flag out. However, in the future, getting the parent pid may require `fork`. If you care about getting the arguments and environment variables, you also need `argv` and `arge`. More about these flags can be found [here](https://www.freebsd.org/cgi/man.cgi?apropos=0&sektion=5&query=audit_control&manpath=FreeBSD+7.0-current&format=html). Note that it might require a reboot of the system for these new flags to take effect. `audit -s` should restart the audit system, but your mileage may vary.

**Note:** Prior to macOS 10.15, OpenBSM was the primary source of real-time audit events. Since macOS 10.15, EndpointSecurity has been available as a newer alternative [and eventual replacement to the now-deprecated OpenBSM](https://developer.apple.com/videos/play/wwdc2020/10159/). However, with osquery, you can collect events from either of these sources.

### Auditing processes with EndpointSecurity

To enable EndpointSecurity in osquery, set `--disable_endpointsecurity=false` in the configuration.

EndpointSecurity is already enabled in the OS on all macOS hosts beginning with macOS 10.15, and needs no special configuration. There are however some additional steps to permit osquery to collect events.

For osquery to capture events in its `es_process_events` table, it must have the Full Disk Access (FDA) permission enabled in macOS Privacy & Security settings. Without this permission, osquery will run as normal, but the table will always be empty. **Note:** If osquery is already running without the permission, it must be restarted after you have granted the permission.

**If osquery is not granted the FDA permission, it will not prompt the user to grant it.** It will just issue a warning (when running with `--verbose`), and the `es_process_events` table will simply be empty when queried.

#### Full Disk Access

The FDA permission (or lack thereof) is inherited from `Terminal.app` when running osquery interactively, but is *not* inherited from `launchctl` when running as a service (including when started using the `osqueryctl` helper script).

| Parent Process | Steps Taken Before Launching osquery | Querying `es_process_events` |
| -------- | -------- | -------- |
| `Terminal.app`¹   | Give Full Disk Access to `Terminal.app` only  | Success     |
| `Terminal.app`¹  | Give FDA to osquery only, or do nothing  |  No events |
| `launchctl`  | Give Full Disk Access to `/opt/osquery/lib/osquery.app/Contents/MacOS/osqueryd`² only | Success |
| `launchctl`  | Give FDA to `launchctl` only, or do nothing  | No events  |

¹ : if you use a third-party terminal emulator like `iTerm.app`, grant that the permission instead of `Terminal.app`.

² : whether running osquery via `osqueryi`, `osqueryd`, or `osqueryd -S`, the permissions will be the same in each case.

##### Manually Granting Permissions

To manually enable FDA permissions for an executable: open System Preferences, go to Security & Privacy, select the Privacy tab, and find the Full Disk Access item on the left side. Unlock the System Preferences pane (lower left side lock icon) and enter your credentials. On the right side, clicking the `+` icon adds a new entry to the list, and you can select the executable to be granted this permission. The Finder-based file browser doesn't see paths like `/usr` by default, but you can either drag-and-drop the executable from another Finder window, or you can begin typing with `/` and enter the path explicitly. **Note:** the executable must already exist at that path before it can be manually granted the permission this way.

##### Automatically Granting Permissions (Silent Installs)

If a macOS host is enrolled in MDM, the FDA permissions can be granted silently by pushing a "PPPC payload" configuration profile (Privacy Preferences Policy Control) that sets the `SystemPolicyAllFiles` (*i.e.*, the FDA) key. A PPPC payload silently sets permissions for the executable identified by its `CodeRequirement`.

To get the appropriate `CodeRequirement` identifier, use the `codesign` tool and then copy everything in the output after the `designated =>`.

```shell
> codesign  -dr - /opt/osquery/lib/osquery.app/Contents/MacOS/osqueryd
Executable=/opt/osquery/lib/osquery.app/Contents/MacOS/osqueryd
designated => identifier "io.osquery.agent" and anchor apple generic and certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = "3522FA9PXF"
```
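
If you generate profiles from a script, the requirement string can be extracted from captured `codesign` output. This is a minimal sketch that operates on text shaped like the output above. The function name `designated_requirement` is hypothetical.

```python
def designated_requirement(codesign_output):
    """Return everything after `designated => `, or None if the line is absent."""
    marker = "designated => "
    for line in codesign_output.splitlines():
        if line.startswith(marker):
            return line[len(marker):].strip()
    return None


# Output copied from the example above.
sample_output = (
    "Executable=/opt/osquery/lib/osquery.app/Contents/MacOS/osqueryd\n"
    'designated => identifier "io.osquery.agent" and anchor apple generic '
    "and certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ "
    "and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ "
    'and certificate leaf[subject.OU] = "3522FA9PXF"\n'
)
print(designated_requirement(sample_output))
```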

For your deployment, either generate an equivalent profile using your MDM dashboard (specifying `/usr/local/bin/osqueryd` as `Identifier` and `path` as the `Identifier Type` and setting `SystemPolicyAllFiles` to `Allow`), or just use the example configuration profile below, ensuring the correct value for the following fields:

- `PayloadOrganization` (your organization)
- `CodeRequirement` (see above)

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
 <key>PayloadContent</key>
 <array>
  <dict>
   <key>PayloadDescription</key>
   <string>osqueryd</string>
   <key>PayloadDisplayName</key>
   <string>osqueryd</string>
   <key>PayloadIdentifier</key>
   <string>BDBD19F2-A35A-4AEC-9E96-3CA7E2994666</string>
   <key>PayloadOrganization</key>
   <string>Trail of Bits</string>
   <key>PayloadType</key>
   <string>com.apple.TCC.configuration-profile-policy</string>
   <key>PayloadUUID</key>
   <string>89121197-3B5F-4502-BB8C-4331261D3B8C</string>
   <key>PayloadVersion</key>
   <integer>1</integer>
   <key>Services</key>
   <dict>
    <key>SystemPolicyAllFiles</key>
    <array>
     <dict>
      <key>Allowed</key>
      <true/>
      <key>CodeRequirement</key>
      <string>identifier "io.osquery.agent" and anchor apple generic and certificate 1[field.1.2.840.113635.100.6.2.6] /* exists */ and certificate leaf[field.1.2.840.113635.100.6.1.13] /* exists */ and certificate leaf[subject.OU] = "3522FA9PXF"</string>
      <key>Comment</key>
      <string></string>
      <key>Identifier</key>
      <string>io.osquery.agent</string>
      <key>IdentifierType</key>
      <string>bundleID</string>
     </dict>
    </array>
   </dict>
  </dict>
 </array>
 <key>PayloadDescription</key>
 <string>osqueryd</string>
 <key>PayloadDisplayName</key>
 <string>osqueryd</string>
 <key>PayloadIdentifier</key>
 <string>BDBD19F2-A35A-4AEC-9E96-3CA7E2994666</string>
 <key>PayloadOrganization</key>
 <string>Trail of Bits</string>
 <key>PayloadScope</key>
 <string>System</string>
 <key>PayloadType</key>
 <string>Configuration</string>
 <key>PayloadUUID</key>
 <string>28A8A2B7-A91E-4C26-BAEC-00F6F542742E</string>
 <key>PayloadVersion</key>
 <integer>1</integer>
</dict>
</plist>
```

### Auditing processes and sockets with OpenBSM

To enable OpenBSM in osquery, set `--disable_audit=false` in the configuration.

OpenBSM is already enabled in the OS on all macOS installations, but with its default settings it doesn't audit process execution or the root user. To start process auditing on macOS, edit the `audit_control` file in `/etc/security/`. An example configuration is provided below, but the important flags are: `ex`, `pc`, `argv`, and `arge`. The `ex` flag will log `exec` events, while `pc` logs `exec`, `fork`, and `exit`. If you don't need `fork` and `exit` you may leave that flag out. However, in the future, getting the parent pid may require `fork`. If you care about getting the arguments and environment variables, you also need `argv` and `arge`. More about these flags can be found [here](https://www.freebsd.org/cgi/man.cgi?apropos=0&sektion=5&query=audit_control&manpath=FreeBSD+7.0-current&format=html). Note that it might require a reboot of the system for these new flags to take effect. `audit -s` should restart the audit system, but your mileage may vary.

```text
#
# $P4: //depot/projects/trustedbsd/openbsm/etc/audit_control#8 $
#
dir:/var/audit
flags:ex,pc,ap,aa,lo,nt
minfree:5
naflags:no
policy:cnt,argv,arge
filesz:2M
expire-after:10M
superuser-set-sflags-mask:has_authenticated,has_console_access
superuser-clear-sflags-mask:has_authenticated,has_console_access
member-set-sflags-mask:
member-clear-sflags-mask:has_authenticated
```
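
To confirm that an `audit_control` file contains the entries discussed above, it can be parsed as `key:value` lines. The sketch below assumes that simple layout (comments start with `#`, values are comma-separated). The function names and the second sample string are hypothetical.

```python
def parse_audit_control(text):
    """Map each audit_control key to its list of comma-separated values."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        settings[key] = [item for item in value.split(",") if item]
    return settings


def missing_for_process_auditing(text):
    """List the flags and policies named in this section that are absent."""
    settings = parse_audit_control(text)
    missing = [f"flags:{name}" for name in ("ex", "pc")
               if name not in settings.get("flags", [])]
    missing += [f"policy:{name}" for name in ("argv", "arge")
                if name not in settings.get("policy", [])]
    return missing


example = "dir:/var/audit\nflags:ex,pc,ap,aa,lo,nt\npolicy:cnt,argv,arge\n"
print(missing_for_process_auditing(example))
print(missing_for_process_auditing("flags:lo,aa\npolicy:cnt,argv\n"))
```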

## osquery events optimization

This section provides a brief overview of common and recommended
optimizations for event-based tables. These optimizations also apply
to the FIM events.

1. `--events_optimize=true`: apply optimizations when `SELECT`ing from events-based tables, enabled by default.
2. `--events_expiry`: the lifetime of buffered events in seconds, with a default value of 86000.
3. `--events_max`: the maximum number of events to store in the buffer before expiring them, with a default value of 1000.

The goal of these optimizations is to protect the running process and the system from performance impact. By default these are all enabled, which is good for configuration and performance, but may introduce inconsistencies on highly-stressed systems using process auditing.

Optimizations work best when `SELECT`ing often from event-based tables. Otherwise the events are in a buffered state. When an event-based table is selected within the daemon, the backing storage maintaining event data is cleared according to the `--events_expiry` lifetime. Setting this value to `1` will auto-clear events whenever a `SELECT` is performed against the table, minimizing the impact of the buffer.
